INTRODUCTION

This dataset, from the Berkeley Earth data page, is made up of temperature recordings from the Earth’s surface.

The data ranges from November 1st, 1743 to December 1st, 2013. The dataset files used are:

  • GlobalLandTemperaturesByMajorCity
  • GlobalLandTemperaturesByState

Continents data, taken from the Machin GitHub, is also integrated to help meet the objectives.

GlobalLandTemperaturesByMajorCity

This data is made up of the columns date, Country, Average temperature, temperature uncertainty, City, Latitude and Longitude.

GlobalLandTemperaturesByState

This data is made up of the columns date, Country, Average temperature, temperature uncertainty and State.

Continents data

It contains the columns name, alpha-2, alpha-3, region, region code, sub-region and ISO code.
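The join between the temperature records and the continents data can be sketched roughly as below. This is only a minimal illustration, assuming pandas and assuming header names (`dt`, `AverageTemperature`, `Country`, `name`, `region`) that are not confirmed by the descriptions above. The rows are invented, and "Unlistedland" is a hypothetical country used to show what an unmatched name looks like.

```python
import pandas as pd

# Toy stand-ins for the real files; every value here is invented.
# The column names are assumptions about the CSV headers.
cities = pd.DataFrame({
    "dt": ["2013-06-01", "2013-06-01", "2013-06-01"],
    "AverageTemperature": [22.5, 27.1, 14.0],
    "City": ["Rome", "Lagos", "Sample City"],
    "Country": ["Italy", "Nigeria", "Unlistedland"],
})
continents = pd.DataFrame({
    "name": ["Italy", "Nigeria"],
    "alpha-3": ["ITA", "NGA"],
    "region": ["Europe", "Africa"],
})

# Parse the date column so time-based operations work later.
cities["dt"] = pd.to_datetime(cities["dt"])

# A left join keeps every temperature record, even when the country
# name has no match in the continents table.
merged = cities.merge(
    continents[["name", "region"]],
    left_on="Country",
    right_on="name",
    how="left",
).drop(columns="name")

# Country names that did not match need manual reconciliation.
unmatched = sorted(merged.loc[merged["region"].isna(), "Country"].unique())
print(merged)
print("Unmatched countries:", unmatched)
```

A left join is chosen here so that spelling differences between the two sources show up as missing regions instead of silently dropping rows.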

The goals of the undertaking are:

  • Maximum temperature among the continents.
  • Average temperature among the continents.
  • The maximum and average temperature recorded by a continent.
  • Temperature differences among countries.
  • A focus on Italy and the city of Rome.
  • Other visualizations, including time series.

I have chosen to isolate Rome and treat that city's data as my dataset, while also taking a brief look at Italy.
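The goals listed above map onto a few grouped aggregations. The sketch below assumes pandas and a joined table with a `region` column standing in for the continent; the column names and all values are illustrative assumptions, not figures from the dataset.

```python
import pandas as pd

# Invented rows; "region" stands in for the continent after the join.
df = pd.DataFrame({
    "region": ["Europe", "Europe", "Africa", "Africa"],
    "Country": ["Italy", "Italy", "Nigeria", "Nigeria"],
    "City": ["Rome", "Rome", "Lagos", "Lagos"],
    "AverageTemperature": [10.0, 24.0, 26.0, 28.0],
})

# Maximum and average temperature per continent.
by_continent = df.groupby("region")["AverageTemperature"].agg(["max", "mean"])

# Average temperature per country, and the gap between the warmest
# and coolest country averages.
by_country = df.groupby("Country")["AverageTemperature"].mean()
country_spread = by_country.max() - by_country.min()

# Isolate Rome as the working dataset.
rome = df[(df["Country"] == "Italy") & (df["City"] == "Rome")]

print(by_continent)
print("Spread between countries:", country_spread)
print(rome)
```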

    Forecasting

    Facebook Prophet was used to produce the forecast.
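A Prophet forecast needs the series in a two-column frame named `ds` (dates) and `y` (values). The sketch below shows one way this could be prepared and fitted. The helper names `to_prophet_frame` and `forecast`, the source column names, the 12-month horizon and the sample values are all assumptions made for illustration, not the settings actually used in this work.

```python
import pandas as pd


def to_prophet_frame(df, date_col="dt", value_col="AverageTemperature"):
    """Rename columns to the ds/y names Prophet expects and drop gaps."""
    out = df[[date_col, value_col]].rename(columns={date_col: "ds", value_col: "y"})
    out["ds"] = pd.to_datetime(out["ds"])
    return out.dropna(subset=["y"]).sort_values("ds").reset_index(drop=True)


def forecast(frame, months=12):
    """Fit Prophet and predict `months` month-start steps past the data."""
    from prophet import Prophet  # imported lazily; needs the prophet package

    model = Prophet()
    model.fit(frame)
    future = model.make_future_dataframe(periods=months, freq="MS")
    return model.predict(future)[["ds", "yhat", "yhat_lower", "yhat_upper"]]


# Invented monthly values standing in for the Rome series.
rome = pd.DataFrame({
    "dt": ["2013-03-01", "2013-01-01", "2013-02-01"],
    "AverageTemperature": [11.0, 7.5, None],
})
frame = to_prophet_frame(rome)
print(frame)
# forecast(frame) is not called here: it needs a far longer series
# than this toy example to give a meaningful result.
```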

    EXPLORING THE DATA DISTRIBUTION

    Each of the estimates covered above sums up the data in a single number that describes its location or variability. It is also useful to explore how the data is distributed overall.

    Boxplots are based on percentiles and give a quick way to visualize the distribution of data. The median is shown by the horizontal line in the box. The dashed lines, referred to as whiskers, extend from the top and bottom of the box to indicate the range for the bulk of the data.
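To make the boxplot description concrete, the numbers a boxplot draws can be computed directly. The sketch below uses only the standard library and assumes the common 1.5 × IQR rule for the whiskers, which is a plotting convention and not something stated above. The helper name `boxplot_summary` and the sample values are illustrative.

```python
import statistics


def boxplot_summary(values, whisker=1.5):
    """Return the numbers a boxplot draws: quartiles and whisker ends."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Invented values, with one reading far above the rest.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

Points beyond the whiskers are the ones a boxplot shows as individual markers.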

    Next, global temperature by state.

    EXPLORING THE DATA DISTRIBUTION

    As in the previous section, boxplots are used to show how the state-level temperature data is distributed overall.

    Here the test statistic is greater than the critical value, so the null hypothesis of non-stationarity cannot be rejected. Hence, the time series is not stationary.

    Here the test statistic is less than the critical value, so the null hypothesis can be rejected, and I am confident the data is stationary.
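The decision rule used in the two readings above can be written down as a small function. The text does not name the test. The sketch below assumes it is the Augmented Dickey-Fuller test as provided by statsmodels (`adfuller`), which matches the direction of the comparison. The helper names and the example numbers are illustrative, not results from this data.

```python
def adf_decision(test_statistic, critical_value):
    """Reject non-stationarity when the statistic is below the critical value."""
    if test_statistic < critical_value:
        return "reject null: series looks stationary"
    return "fail to reject null: series looks non-stationary"


def run_adf(series, level="5%"):
    """Run the ADF test with statsmodels and apply the rule above."""
    from statsmodels.tsa.stattools import adfuller  # imported lazily

    result = adfuller(series)
    stat, critical_values = result[0], result[4]
    return stat, critical_values[level], adf_decision(stat, critical_values[level])


# Illustrative numbers only, not results from the dataset.
print(adf_decision(-1.2, -2.87))
print(adf_decision(-5.4, -2.87))
```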